19 research outputs found

    Spectrum of Fractal Interpolation Functions

    In this paper we compute the Fourier spectrum of the Fractal Interpolation Functions (FIFs) introduced by Michael Barnsley. We show that there is an analytical way to compute it. We also attempt to solve the inverse problem of FIFs by using the spectru
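For readers unfamiliar with FIFs, the following is a minimal sketch of Barnsley's standard affine construction, not the spectral method of the paper. Each pair of consecutive data points gets an affine map that sends the whole graph onto the segment between them. The function names, the sample points, and the vertical scaling factors `d` (each assumed to satisfy |d_i| < 1) are illustrative choices, not taken from the paper.

```python
def fif_maps(points, d):
    """Build the affine maps w_i(x, y) = (a*x + e, c*x + d_i*y + f).

    points: interpolation data [(x_0, y_0), ..., (x_N, y_N)], x increasing.
    d: one vertical scaling factor per interval (assumed |d_i| < 1).
    """
    x0, y0 = points[0]
    xn, yn = points[-1]
    span = xn - x0
    maps = []
    for (xp, yp), (xi, yi), di in zip(points, points[1:], d):
        a = (xi - xp) / span
        e = (xn * xp - x0 * xi) / span
        c = (yi - yp - di * (yn - y0)) / span
        f = (xn * yp - x0 * yi - di * (xn * y0 - x0 * yn)) / span
        maps.append((a, e, c, f, di))
    return maps


def apply_map(m, x, y):
    """Apply one affine map to a point of the plane."""
    a, e, c, f, di = m
    return a * x + e, c * x + di * y + f


def fif_graph(points, d, depth=6):
    """Approximate the FIF graph by applying all maps `depth` times."""
    maps = fif_maps(points, d)
    graph = list(points)
    for _ in range(depth):
        graph = sorted({apply_map(m, x, y) for m in maps for (x, y) in graph})
    return graph


# Hypothetical data, chosen only for illustration.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5), (3.0, 2.0)]
d = [0.3, -0.2, 0.4]
graph = fif_graph(points, d, depth=3)
```

By construction, each map sends the first data point to the left end of its interval and the last data point to the right end, so every iterate still passes through the interpolation data.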

    Isometry and convexity in dimensionality reduction

    The volume of data generated every year grows exponentially. Both the number of data points and their dimensionality have increased dramatically over the past 15 years. The gap between industry's demand for data processing and the solutions provided by the machine learning community is widening. Despite the growth in memory and computational power, advanced statistical processing on the order of gigabytes remains out of reach. Most sophisticated machine learning algorithms have at least quadratic complexity. With current computer architectures, algorithms with complexity higher than linear O(N) or O(N log N) are not considered practical. Dimensionality reduction is a challenging problem in machine learning. Data represented as multidimensional points often have high dimensionality, yet the information they carry can be expressed with far fewer dimensions. Moreover, the reduced dimensions can be more interpretable than the original ones. There is a great variety of dimensionality reduction algorithms under the theory of Manifold Learning. Most of these methods, such as Isomap, Local Linear Embedding, Local Tangent Space Alignment, and Diffusion Maps, have been extensively studied under the framework of Kernel Principal Component Analysis (KPCA). In this dissertation we study two current state-of-the-art dimensionality reduction methods, Maximum Variance Unfolding (MVU) and Non-Negative Matrix Factorization (NMF). These two methods do not fit under the umbrella of Kernel PCA. MVU is cast as a Semidefinite Program, a modern convex nonlinear optimization formulation that offers more flexibility and power than KPCA. Although MVU and NMF seem to be two disconnected problems, we show that there is a connection between them: both are special cases of a general nonlinear factorization algorithm that we developed.
Two aspects of the algorithms are of particular interest: computational complexity and interpretability. Computational complexity answers the question of how fast we can find the best solution of MVU/NMF for large data volumes. Since we are dealing with optimization programs, we need to find the global optimum, and finding it is strongly connected with the convexity of the problem. Interpretability is strongly connected with local isometry, which gives meaning to the relationships between data points. Another aspect of interpretability is the association of data with labeled information. The contributions of this thesis are the following:
1. MVU is modified so that it scales more efficiently. Results are shown on speech datasets of 1 million points. Limitations of the method are highlighted.
2. An algorithm for fast computation of the furthest neighbors is presented for the first time in the literature.
3. Construction of optimal kernels for Kernel Density Estimation with modern convex programming is presented. For the first time we show that the Leave-One-Out Cross Validation (LOOCV) function is quasi-concave.
4. For the first time NMF is formulated as a convex optimization problem.
5. An algorithm for the problem of Completely Positive Matrix Factorization is presented.
6. A hybrid algorithm of MVU and NMF, the isoNMF, is presented, combining advantages of both methods.
7. The Isometric Separation Maps (ISM), a variation of MVU that contains classification information, is presented.
8. Large scale nonlinear dimensional analysis on the TIMIT speech database is performed.
9. A general nonlinear factorization algorithm is presented based on sequential convex programming.
Despite the efforts to scale the proposed methods up to 1 million data points in reasonable time, the gap between the industrial demand and the current state of the art is still orders of magnitude wide.
Ph.D. Committee Chair: David Anderson; Committee Co-Chair: Alexander Gray; Committee Member: Anthony Yezzi; Committee Member: Hongyuan Zha; Committee Member: Justin Romberg; Committee Member: Ronald Schafe
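To make the leave-one-out cross validation score for Kernel Density Estimation mentioned in the abstract concrete, here is a minimal sketch with a one-dimensional Gaussian kernel and a plain grid search over bandwidths. It is not the thesis's convex programming construction; the function names, the sample data, and the candidate bandwidths are assumptions made for illustration.

```python
import math


def loo_log_likelihood(data, h):
    """Leave-one-out log-likelihood of a Gaussian KDE with bandwidth h."""
    n = len(data)
    norm = (n - 1) * h * math.sqrt(2.0 * math.pi)
    total = 0.0
    for i, xi in enumerate(data):
        # Density at xi estimated from every point except xi itself.
        s = sum(
            math.exp(-0.5 * ((xi - xj) / h) ** 2)
            for j, xj in enumerate(data)
            if j != i
        )
        # Guard against underflow to zero for very small bandwidths.
        total += math.log(s / norm) if s > 0.0 else -math.inf
    return total


def best_bandwidth(data, candidates):
    """Pick the candidate bandwidth with the highest LOOCV score."""
    return max(candidates, key=lambda h: loo_log_likelihood(data, h))


# Hypothetical sample: two small clusters on the real line.
data = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
chosen = best_bandwidth(data, [0.01, 0.1, 1.0, 10.0])
```

A grid search like this costs O(N^2) per candidate bandwidth, which is the kind of quadratic cost the abstract describes as impractical at scale.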

    ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution

    Entity resolution (ER), an important and common data cleaning problem, is about detecting duplicate data representations of the same external entities and merging them into single representations. Relatively recently, declarative rules called "matching dependencies" (MDs) have been proposed for specifying similarity conditions under which attribute values in database records are merged. In this work we show the process and the benefits of integrating four components of ER: (a) building a classifier for duplicate/non-duplicate record pairs using machine learning (ML) techniques; (b) using MDs to support the blocking phase of ML; (c) merging records on the basis of the classifier results; and (d) using the declarative language "LogiQL", an extended form of Datalog supported by the "LogicBlox" platform, for all activities related to data processing and for the specification and enforcement of MDs.
    Comment: Final journal version, with some minor technical corrections. Extended version of arXiv:1508.0601
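The blocking-then-classification pipeline described in this abstract can be sketched generically as follows. This is not ERBlox, LogiQL, or a matching dependency: the blocking key, the string-similarity threshold standing in for a trained classifier, and the sample records are all hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def block(records, key):
    """Group records by a blocking key so only same-block pairs are compared."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[key(rec)].append(rec)
    return blocks


def similar(a, b, threshold=0.8):
    """Stand-in for a trained classifier: a simple string-similarity cutoff."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def candidate_duplicates(records, key, field):
    """Return id pairs judged to be duplicates within each block."""
    pairs = []
    for group in block(records, key).values():
        for r1, r2 in combinations(group, 2):
            if similar(r1[field], r2[field]):
                pairs.append((r1["id"], r2["id"]))
    return pairs


# Hypothetical records, for illustration only.
records = [
    {"id": 1, "name": "John Smith", "zip": "10001"},
    {"id": 2, "name": "Jon Smith", "zip": "10001"},
    {"id": 3, "name": "John Smith", "zip": "94105"},
    {"id": 4, "name": "Mary Jones", "zip": "10001"},
]
duplicates = candidate_duplicates(records, key=lambda r: r["zip"], field="name")
```

Records 1 and 3 share a name but fall into different blocks, so they are never compared. That is the trade-off of blocking: fewer comparisons at the risk of missed matches, which is why the choice of blocking criteria matters.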

    On algorithmically boosting fixed-point computations

    This paper is a thought experiment on exponentiating algorithms. One of its main contributions is to show that this idea finds material implementation in exponentiating fixed-point computation algorithms. Various problems in computer science can be cast as instances of computing a fixed point of a map. In this paper, we present a general method of boosting the convergence of iterative fixed-point computations that we call algorithmic boosting, which is a (slight) generalization of algorithmic exponentiation. First, we define our method in the general setting of nonlinear maps. Second, we restrict attention to convergent linear maps and show that our algorithmic boosting method can yield exponential speedups in the convergence rate. Third, we show that algorithmic boosting can convert a (weak) non-convergent iterator to a (strong) convergent one. We then consider a variational approach to algorithmic boosting, providing tools to convert a non-convergent continuous flow to a convergent one. Finally, we discuss implementations of the exponential function, an important issue even for the scalar case.
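One well-known way to see how a linear fixed-point iteration can be sped up exponentially is repeated squaring of the affine map x -> Ax + b: composing the map with itself gives x -> A^2 x + (Ab + b), so k squarings reproduce 2^k plain iterations. The sketch below illustrates that idea only; whether it matches the paper's construction of algorithmic boosting is an assumption, and all names and sample values are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(A, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def square_affine(A, b):
    """Compose x -> A x + b with itself: x -> A^2 x + (A b + b)."""
    Ab = mat_vec(A, b)
    return mat_mul(A, A), [u + v for u, v in zip(Ab, b)]


def boosted_iterate(A, b, x0, squarings):
    """Result of 2**squarings plain iterations, using only `squarings` compositions."""
    for _ in range(squarings):
        A, b = square_affine(A, b)
    return [u + v for u, v in zip(mat_vec(A, x0), b)]


def plain_iterate(A, b, x0, steps):
    """Reference: apply x -> A x + b `steps` times."""
    x = list(x0)
    for _ in range(steps):
        x = [u + v for u, v in zip(mat_vec(A, x), b)]
    return x


# Hypothetical contraction (spectral radius 0.5), for illustration only.
A = [[0.5, 0.1], [0.0, 0.4]]
b = [1.0, 2.0]
fast = boosted_iterate(A, b, [0.0, 0.0], squarings=5)
slow = plain_iterate(A, b, [0.0, 0.0], steps=32)
```

Each squaring costs a matrix product rather than a matrix-vector product, so the approach pays off when many iterations are needed relative to the dimension.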

    A methodological approach for holistic energy planning using the living lab concept: the case of the prefecture of Karditsa

    The development of urban and rural landscapes has entered a pioneering era with novel combinations of energy production and consumption and related changes in the urban and rural fabric, including associated socioeconomic issues. Accompanying this change is a realization that newly developing energy initiatives are more viable for development and upscaling and are less vulnerable to failure and resistance from society if they are well integrated into their local and regional contexts. However, institutional questions remain regarding the required mechanisms and levels of integration, while simultaneously sustainable energy planning requires that the stakeholders with diverse and conflicting objectives come to some degree of consensus. Inspired by these findings, a methodological approach for holistic energy planning on a regional/local level was developed within the framework of the INTENSSS-PA project that is funded by HORIZON 2020. The approach provides a holistic energy plan, which goes beyond a blueprint for allocating renewable technologies and is based on the involvement of the wider community. Hence, this approach includes aspects such as the development of spatial concepts, new co-creating strategies, business cases, societal alliances and institutional changes and formats. To implement this approach, the Living Lab (LL) concept is applied. The case of Karditsa, in Greece, will be presented as evidence of the effectiveness of the proposed planning approach.